AITopics | pose error

pose error

MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild

Gregory Rogez, Cordelia Schmid

Neural Information Processing Systems | Mar-23-2026, 06:18:56 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, pose estimation, (19 more...)

Country: Europe (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.58)

Bias-Eliminated PnP for Stereo Visual Odometry: Provably Consistent and Large-Scale Localization

Zeng, Guangyang, Shen, Yuan, Hong, Ziyang, Hong, Yuze, Ila, Viorela, Shi, Guodong, Wu, Junfeng

arXiv.org Artificial Intelligence | Oct-30-2025

In this paper, we first present a bias-eliminated weighted (Bias-Eli-W) perspective-n-point (PnP) estimator for stereo visual odometry (VO) with provable consistency. Specifically, leveraging statistical theory, we develop an asymptotically unbiased and √n-consistent PnP estimator that accounts for varying 3D triangulation uncertainties, ensuring that the relative pose estimate converges to the ground truth as the number of features increases. Next, on the stereo VO pipeline side, we propose a framework that continuously triangulates contemporary features for tracking new frames, effectively decoupling temporal dependencies between pose and 3D point errors. We integrate the Bias-Eli-W PnP estimator into the proposed stereo VO pipeline, creating a synergistic effect that enhances the suppression of pose estimation errors. Experimental results demonstrate that our method: 1) achieves significant improvements in both relative pose error and absolute trajectory error in large-scale environments; 2) provides reliable localization under erratic and unpredictable robot motions. The successful implementation of the Bias-Eli-W PnP in stereo VO indicates the importance of information screening in robotic estimation tasks with high-uncertainty measurements, shedding light on diverse applications where PnP is a key ingredient. Index Terms: Stereo visual odometry, PnP pose estimation, large-scale localization, consistent estimator. Visual odometry (VO) refers to estimating the pose of a moving camera in a 3D space from sequential images captured by the camera. The significance of VO stems from its advantages of being infrastructure-free, cost-effective, lightweight, energy-efficient, etc. [1, 2, 3]. It enables robots to perceive and navigate their environment autonomously. Compared with monocular VO, stereo VO offers several advantages, such as scale consistency, better accuracy, and enhanced robustness, due to its ability to perceive depth directly [4, 5]. Existing VO methods typically optimize both camera poses and 3D map points simultaneously, with the map being used to track new frames through the perspective-n-point (PnP) algorithm [1, 6, 4].
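
The abstract above reports results in terms of relative pose error. As a rough illustration of what that metric measures, here is a minimal sketch, assuming planar poses given as (x, y, theta) tuples rather than the full 3D poses a stereo VO system produces. The function names (`wrap`, `relative_motion`, `relative_pose_errors`, `rmse`) are hypothetical and are not taken from the paper or its code.

```python
import math


def wrap(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def relative_motion(a, b):
    """Return the motion from pose a to pose b, expressed in a's frame.

    Poses are (x, y, theta) tuples; this computes a^-1 * b in SE(2).
    """
    ax, ay, ath = a
    bx, by, bth = b
    dx, dy = bx - ax, by - ay
    c, s = math.cos(ath), math.sin(ath)
    # Rotate the world-frame displacement into a's frame (R(a)^T * d).
    return (c * dx + s * dy, -s * dx + c * dy, wrap(bth - ath))


def relative_pose_errors(gt, est, delta=1):
    """Per-step (translation error, rotation error) over a fixed frame gap.

    For each i, the ground-truth motion over `delta` frames is compared
    with the estimated motion over the same frames.
    """
    errors = []
    for i in range(len(gt) - delta):
        g = relative_motion(gt[i], gt[i + delta])
        e = relative_motion(est[i], est[i + delta])
        err = relative_motion(g, e)  # residual motion g^-1 * e
        errors.append((math.hypot(err[0], err[1]), abs(err[2])))
    return errors


def rmse(values):
    """Root-mean-square of a non-empty sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))
```

Calling `rmse([t for t, _ in relative_pose_errors(gt, est)])` gives one translation figure for a trajectory. Because only frame-to-frame motions are compared, a constant offset between the two trajectories does not affect the result, which is what distinguishes this metric from absolute trajectory error.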

artificial intelligence, machine learning, odometry, (18 more...)

doi: 10.1109/LRA.2025.3614050

2504.1741

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Video Understanding (0.55)

Freehand 3D Ultrasound Imaging: Sim-in-the-Loop Probe Pose Optimization via Visual Servoing

Zhang, Yameng, Huang, Dianye, Meng, Max Q. -H., Navab, Nassir, Jiang, Zhongliang

arXiv.org Artificial Intelligence | Oct-20-2025

Freehand 3D ultrasound (US) imaging using conventional 2D probes offers flexibility and accessibility for diverse clinical applications but faces challenges in accurate probe pose estimation. Traditional methods depend on costly tracking systems, while neural network-based methods struggle with image noise and error accumulation, compromising reconstruction precision. We propose a cost-effective and versatile solution that leverages lightweight cameras and visual servoing in simulated environments for precise 3D US imaging. To counter occlusions and lighting issues, we introduce an image restoration method that reconstructs occluded regions by matching surrounding texture patterns. For pose estimation, we develop a simulation-in-the-loop approach, which replicates the system setup in simulation and iteratively minimizes pose errors between simulated and real-world observations. Validations on a soft vascular phantom, a 3D-printed conical model, and a human arm demonstrate the robustness and accuracy of our approach, with Hausdorff distances to the reference reconstructions of 0.359 mm, 1.171 mm, and 0.858 mm, respectively. These results confirm the method's potential for reliable freehand 3D US reconstruction. Project resources are available at https://github.com/Y [...] Medical ultrasound (US) is widely used in modern clinical practice due to its low cost, real-time imaging, and lack of ionizing radiation. It serves as a first-line tool in various applications, including obstetrics and emergency medicine. This study was partly supported by the Multiscale Medical Robotics Centre, AIR@InnoHK and SINO-German Mobility Project under Grant M0221. Yameng Zhang is with the Department of Mechanical Engineering, The University of Hong Kong, Hong Kong SAR, China, and also with the Department of Electronic Engineering, The Chinese University of Hong Kong (CUHK), Hong Kong SAR, China (e-mail: zhangyameng@link.cuhk.edu.hk).
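
The reconstruction accuracy above is reported as a Hausdorff distance. For readers unfamiliar with the metric, the following is a minimal brute-force sketch, assuming both reconstructions are available as lists of 3D points in millimetres. The function names are hypothetical, and the abstract does not say how the authors computed their figures.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy point sets (assumed data, not from the paper), coordinates in mm.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
worst_case_mm = hausdorff(reference, reconstruction)
```

The metric is a worst-case measure: a single stray point 2 mm from the reference sets the result even though every other point matches exactly. This brute-force version compares every pair of points, so it suits small sets only.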

artificial intelligence, image understanding, machine learning, (18 more...)

doi: 10.1109/TMECH.2025.3621347

2510.15668

Country: Asia > China > Hong Kong (0.85)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.94)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

DuLoc: Life-Long Dual-Layer Localization in Changing and Dynamic Expansive Scenarios

Jiang, Haoxuan, Qian, Peicong, Xie, Yusen, Li, Xiaocong, Liu, Ming, Ma, Jun

arXiv.org Artificial Intelligence | Aug-1-2025

LiDAR-based localization serves as a critical component in autonomous systems, yet existing approaches face persistent challenges in balancing repeatability, accuracy, and environmental adaptability. To address these challenges, this paper proposes DuLoc, a robust and accurate localization method that tightly couples LiDAR-inertial odometry with offline map-based localization, incorporating a constant-velocity motion model to mitigate outlier noise in real-world scenarios. Specifically, we develop a LiDAR-based localization framework that seamlessly integrates a prior global map with dynamic real-time local maps, enabling robust localization in unbounded and changing environments. Extensive real-world experiments in an ultra-unbounded port, involving 2,856 hours of operational data across 32 Intelligent Guided Vehicles (IGVs), are conducted and reported in this study. The results demonstrate that our system outperforms other state-of-the-art LiDAR localization systems in large-scale changing outdoor environments. I. INTRODUCTION High-precision life-long localization in large-scale environments faces fundamental challenges across various autonomous systems [1], [2], [3], [4].
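
The abstract mentions a constant-velocity motion model used to mitigate outlier noise but does not give its formulation. The sketch below shows only the generic idea, assuming 2D positions and a simple distance gate. The function names and the `max_jump` threshold are hypothetical and should not be read as DuLoc's actual method.

```python
import math


def predict_constant_velocity(prev, curr, dt_prev, dt_next):
    """Extrapolate the next 2D position, assuming velocity stays constant.

    prev and curr are (x, y) positions dt_prev seconds apart; the
    prediction is made dt_next seconds after curr.
    """
    vx = (curr[0] - prev[0]) / dt_prev
    vy = (curr[1] - prev[1]) / dt_prev
    return (curr[0] + vx * dt_next, curr[1] + vy * dt_next)


def gate_measurement(predicted, measured, max_jump):
    """Keep the measurement only if it lies within max_jump of the prediction.

    A measurement farther away is treated as an outlier and replaced by
    the motion-model prediction (one simple policy among many).
    """
    if math.dist(predicted, measured) <= max_jump:
        return measured
    return predicted
```

With positions (0, 0) and (1, 0) one second apart, the prediction one second later is (2, 0). A localization result at (2.1, 0) passes a 0.5 m gate, while one at (9, 0) is rejected in favour of the prediction.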

algorithm, artificial intelligence, international conference, (12 more...)

2507.2366

Country: Asia > China > Guangdong Province (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)

An Efficient Method for Accurate Pose Estimation and Error Correction of Cuboidal Objects

Rai, Utsav, Mehta, Hardik, Vakharia, Vismay, Choudhary, Aditya, Parmar, Amit, Lima, Rolif, Das, Kaushik

arXiv.org Artificial Intelligence | May-9-2025

The proposed system outlined in this paper is a solution to a use case that requires the autonomous picking of cuboidal objects from an organized or unorganized pile with high precision. This paper presents an efficient method for precise pose estimation of cuboid-shaped objects, which aims to reduce errors in target pose in a time-efficient manner. Typical pose estimation methods like global point cloud registrations are prone to minor pose errors for which local registration algorithms are generally used to improve pose accuracy. However, due to the execution time overhead and uncertainty in the error of the final achieved pose, an alternate, linear time approach is proposed for pose error estimation and correction. This paper presents an overview of the solution followed by a detailed description of individual modules of the proposed algorithm.

artificial intelligence, data quality, point cloud, (20 more...)

2505.04962

Country: Asia > India (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (0.85)
Information Technology > Data Science > Data Quality > Data Cleaning (0.51)

Keypoint Semantic Integration for Improved Feature Matching in Outdoor Agricultural Environments

de Silva, Rajitha, Cox, Jonathan, Popovic, Marija, Cadena, Cesar, Stachniss, Cyrill, Polvara, Riccardo

arXiv.org Artificial Intelligence | Mar-11-2025

Robust robot navigation in outdoor environments requires accurate perception systems capable of handling visual challenges such as repetitive structures and changing appearances. Visual feature matching is crucial to vision-based pipelines but remains particularly challenging in natural outdoor settings due to perceptual aliasing. We address this issue in vineyards, where repetitive vine trunks and other natural elements generate ambiguous descriptors that hinder reliable feature matching. We hypothesise that semantic information tied to keypoint positions can alleviate perceptual aliasing by enhancing keypoint descriptor distinctiveness. To this end, we introduce a keypoint semantic integration technique that improves the descriptors in semantically meaningful regions within the image, enabling more accurate differentiation even among visually similar local features. We validate this approach in two vineyard perception tasks: (i) relative pose estimation and (ii) visual localisation. Across all tested keypoint types and descriptors, our method improves matching accuracy by 12.6%, demonstrating its effectiveness over multiple months in challenging vineyard conditions.

accuracy, descriptor, keypoint, (16 more...)

2503.08843

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > United Kingdom > England > Lincolnshire > Lincoln (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.49)
(3 more...)

CoDiff: Conditional Diffusion Model for Collaborative 3D Object Detection

Huang, Zhe, Wang, Shuo, Wang, Yongcai, Wang, Lei

arXiv.org Artificial Intelligence | Feb-16-2025

Collaborative 3D object detection holds significant importance in the field of autonomous driving, as it greatly enhances the perception capabilities of each individual agent by facilitating information exchange among multiple agents. However, in practice, due to pose estimation errors and time delays, the fusion of information across agents often results in feature representations with spatial and temporal noise, leading to detection errors. Diffusion models naturally have the ability to denoise noisy samples to the ideal data, which motivates us to explore the use of diffusion models to address the noise problem between multi-agent systems. In this work, we propose CoDiff, a novel robust collaborative perception framework that leverages the potential of diffusion models to generate more comprehensive and clearer feature representations. To the best of our knowledge, this is the first work to apply diffusion models to multi-agent collaborative perception. Specifically, we project the high-dimensional feature map into the latent space of a powerful pre-trained autoencoder. Within this space, individual agent information serves as a condition to guide the diffusion model's sampling. Experiments on both simulated and real-world datasets demonstrate that CoDiff consistently outperforms existing relevant methods in collaborative object detection performance, and remains robust when the pose and delay information of agents carries high-level noise.

codiff, detection, diffusion model, (13 more...)

2502.14891

Country:

Oceania > Australia > New South Wales > Wollongong (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology (0.67)
Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection

Wang, Sichao, Zhang, Chuang, Yuan, Ming, Xu, Qing, He, Lei, Wang, Jianqiang

arXiv.org Artificial Intelligence | Jan-4-2025

In V2X collaborative perception, the domain gaps between heterogeneous nodes pose a significant challenge for effective information fusion. Pose errors arising from latency and GPS localization noise further exacerbate the issue by leading to feature misalignment. To overcome these challenges, we propose V2X-DGPE, a high-accuracy and robust V2X feature-level collaborative perception framework. V2X-DGPE employs a Knowledge Distillation Framework and a Feature Compensation Module to learn domain-invariant representations from multi-source data, effectively reducing the feature distribution gap between vehicles and roadside infrastructure. Historical information is utilized to provide the model with a more comprehensive understanding of the current scene. Furthermore, a Collaborative Fusion Module leverages a heterogeneous self-attention mechanism to extract and integrate heterogeneous representations from vehicles and infrastructure. To address pose errors, V2X-DGPE introduces a deformable attention mechanism, enabling the model to adaptively focus on critical parts of the input features by dynamically offsetting sampling points. Extensive experiments on the real-world DAIR-V2X dataset demonstrate that the proposed method outperforms existing approaches, achieving state-of-the-art detection performance. The code is available at https://github.com/wangsch10/V2X-DGPE.

artificial intelligence, infrastructure, machine learning, (16 more...)

2501.02363

Country: Asia (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

An Efficient Scene Coordinate Encoding and Relocalization Method

Xu, Kuan, Jiang, Zeyu, Cao, Haozhi, Yuan, Shenghai, Wang, Chen, Xie, Lihua

arXiv.org Artificial Intelligence | Dec-9-2024

Scene Coordinate Regression (SCR) is a visual localization technique that utilizes deep neural networks (DNN) to directly regress 2D-3D correspondences for camera pose estimation. However, current SCR methods often face challenges in handling repetitive textures and meaningless areas due to their reliance on implicit triangulation. In this paper, we propose an efficient scene coordinate encoding and relocalization method. Compared with the existing SCR methods, we design a unified architecture for both scene encoding and salient keypoint detection, enabling our system to focus on encoding informative regions, thereby significantly enhancing efficiency. Additionally, we introduce a mechanism that leverages sequential information during both map encoding and relocalization, which strengthens implicit triangulation, particularly in repetitive texture environments. Comprehensive experiments conducted across indoor and outdoor datasets demonstrate that the proposed system outperforms other state-of-the-art (SOTA) SCR methods. Our single-frame relocalization mode improves the recall rate of our baseline by 6.4% and increases the running speed from 56Hz to 90Hz. Furthermore, our sequence-based mode increases the recall rate by 11% while maintaining the original efficiency.

machine learning, natural language, relocalization, (19 more...)

2412.06488

Country:

Asia > Singapore (0.04)
North America > United States > New York > Erie County > Buffalo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.97)
(2 more...)

EI-Nexus: Towards Unmediated and Flexible Inter-Modality Local Feature Extraction and Matching for Event-Image Data

Yi, Zhonghua, Shi, Hao, Jiang, Qi, Yang, Kailun, Wang, Ze, Gu, Diyang, Zhang, Yufan, Wang, Kaiwei

arXiv.org Artificial Intelligence | Oct-29-2024

Although event cameras offer high temporal resolution and high dynamic range, there has been limited research on inter-modality local feature extraction and matching for event-image data. We propose EI-Nexus, an unmediated and flexible framework that integrates two modality-specific keypoint extractors and a feature matcher. To achieve keypoint extraction across viewpoint and modality changes, we introduce Local Feature Distillation (LFD), which transfers the viewpoint consistency from a well-learned image extractor to the event extractor, ensuring robust feature correspondence. Furthermore, Context Aggregation (CA) yields a remarkable enhancement in feature matching. We further establish the first two inter-modality feature matching benchmarks, MVSEC-RPE and EC-RPE, to assess relative pose estimation on event-image data. Our approach outperforms traditional methods that rely on explicit modal transformation, offering more unmediated and adaptable feature extraction and matching, achieving better keypoint similarity and state-of-the-art results on the MVSEC-RPE and EC-RPE benchmarks. The source code and benchmarks will be made publicly available at https://github.com/ZhonghuaYi/EI-Nexus_official.

artificial intelligence, data mining, keypoint, (16 more...)

2410.21743

Genre: Research Report (0.64)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science > Data Mining > Feature Extraction (0.83)
